Append (Superset) (Operator Toolbox)

Synopsis

This operator provides the same functionality as the Append operator from RapidMiner Core, but the attributes of the ExampleSets to be appended do not have to be the same or of the same type.

Description

This operator builds a merged ExampleSet from two or more ExampleSets by adding all examples into a combined ExampleSet. The attributes of the resulting ExampleSet will be a superset of all attributes in all ExampleSets.

If an attribute is missing from one of the input ExampleSets, its values for the examples from that ExampleSet are set to missing values. If attributes with the same name have different types in the input ExampleSets, the attribute in the resulting ExampleSet is a nominal attribute and its values are converted to nominal values.
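The merge rules above can be sketched in a few lines of plain Python. This is only an illustration of the described behaviour, not the operator's implementation: the dictionary layout of an ExampleSet, the function name append_superset, the type names, and the use of str() for the conversion to nominal are all assumptions made for this example.

```python
MISSING = None  # assumed stand-in for a missing value


def append_superset(example_sets):
    """Append example sets whose attributes may differ in name and type.

    Each example set is assumed to be a dict of the form
    {"types": {attribute_name: type_name}, "rows": [{attribute_name: value}]}.
    """
    # Build the superset of attributes in first-seen order.
    # An attribute whose type differs between inputs becomes nominal.
    merged_types = {}
    for example_set in example_sets:
        for name, type_name in example_set["types"].items():
            if name not in merged_types:
                merged_types[name] = type_name
            elif merged_types[name] != type_name:
                merged_types[name] = "nominal"

    # Copy every example, filling absent attributes with MISSING and
    # converting values of nominal attributes to strings (an assumption).
    merged_rows = []
    for example_set in example_sets:
        for row in example_set["rows"]:
            merged_row = {}
            for name, type_name in merged_types.items():
                value = row.get(name, MISSING)
                if value is not MISSING and type_name == "nominal":
                    value = str(value)
                merged_row[name] = value
            merged_rows.append(merged_row)

    return {"types": merged_types, "rows": merged_rows}


# Hypothetical inputs: "id" is an integer in one set and nominal in the other.
first = {"types": {"id": "integer", "age": "real"},
         "rows": [{"id": 1, "age": 30.5}]}
second = {"types": {"id": "nominal", "city": "nominal"},
          "rows": [{"id": "x2", "city": "Oslo"}]}
merged = append_superset([first, second])
```

In this sketch, merged contains the attributes id, age and city. The attribute id becomes nominal because its type differs between the two inputs, and each example has a missing value for the attribute that its own input lacks.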

If two attributes with different names have the same special role, only the first attribute keeps its role in the resulting ExampleSet. The role of each other attribute is set to <role>_i, where i is a counter that is increased until the name of the role is unique. For example, if three attributes named a, b and c all have the role label, then in the resulting ExampleSet attribute 'a' has the role 'label', 'b' has the role 'label_1' and 'c' has the role 'label_2'.
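The role renaming described above can be sketched as follows. The function name assign_unique_roles and the representation of attributes as (name, role) pairs are assumptions made for illustration. The sketch follows only the rule stated in the text and does not reflect the operator's source code.

```python
def assign_unique_roles(attribute_roles):
    """Return a dict mapping each attribute name to a unique role.

    attribute_roles is assumed to be a list of (attribute_name, role) pairs
    in input order. A role of None marks a regular attribute.
    """
    used_roles = set()
    result = {}
    for name, role in attribute_roles:
        if role is None:
            # Regular attributes have no special role to deduplicate.
            result[name] = None
            continue
        candidate = role
        counter = 0
        # Increase the counter until the role name is unique.
        while candidate in used_roles:
            counter += 1
            candidate = f"{role}_{counter}"
        used_roles.add(candidate)
        result[name] = candidate
    return result


roles = assign_unique_roles([("a", "label"), ("b", "label"), ("c", "label")])
# roles == {"a": "label", "b": "label_1", "c": "label_2"}
```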

Input

  • example set (Data Table)

    This operator can have multiple ExampleSet inputs. When one ExampleSet is connected, another input port becomes available which is ready to accept another ExampleSet. The order of inputs remains the same. Collections of ExampleSets can also be connected.

Output

  • merged set (Data Table)

    The appended ExampleSet.

Tutorial Processes

Appending Titanic Training and Titanic Data Set

This tutorial process is an example of a simple append of two ExampleSets with different attributes. The Titanic Training and the Titanic data sets have similar attributes. The Titanic data set contains all 7 attributes of the Titanic Training data set, plus 5 attributes that do not occur in the Titanic Training data set. The Append (Superset) operator makes it possible to append the two data sets. For the examples from the Titanic Training data set, the values of the attributes that exist only in the Titanic data set are missing.

Appending three datasets with different types and attribute roles

This tutorial process demonstrates how the Append (Superset) operator appends data sets whose attributes have different types and whose attribute roles are assigned to different attributes. See the comments in the process for more information.